Research Article: Methods for Stratified Cluster Sampling with Informative Stratification
Abstract
We consider fitting regression models using data from stratified cluster samples when the strata may depend in some way on the observed responses within clusters. One important subclass of examples is that of family studies in genetic epidemiology, where the probability of selecting a family into the study depends on the incidence of disease within the family. We develop the survey-weighted estimating equation approach for this problem, with particular emphasis on the estimation of superpopulation parameters. Full maximum likelihood for this class of problems involves modelling the population distribution of the covariates, which is not feasible when there are many potential covariates. We discuss efficient semiparametric maximum likelihood methods in which the covariate distribution is left completely unspecified. We further discuss the relative efficiencies of the survey-weighted and semiparametric maximum likelihood approaches.
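As a rough illustration of the survey-weighted estimating equation idea mentioned in the abstract, the sketch below solves a weighted score equation for a logistic regression by Newton-Raphson. It is not the authors' implementation. The logistic model, the function names `weighted_logistic_fit` and `weighted_score`, and the choice of weights `w` as inverse selection probabilities are all assumptions made for illustration. Variance estimation, which would have to account for the clustering, is not covered.

```python
import numpy as np


def weighted_score(X, y, w, beta):
    """Weighted score: sum_i w_i * x_i * (y_i - p_i), with p_i = expit(x_i' beta)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (w * (y - p))


def weighted_logistic_fit(X, y, w, n_iter=50, tol=1e-10):
    """Solve the weighted score equation for beta by Newton-Raphson.

    X : (n, p) design matrix (include a column of ones for an intercept)
    y : (n,) binary responses
    w : (n,) survey weights, assumed here to be inverse selection probabilities
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        # Weighted information matrix: sum_i w_i * p_i * (1 - p_i) * x_i x_i'
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(info, weighted_score(X, y, w, beta))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

With integer weights, the point estimate from this sketch coincides with an unweighted fit in which each observation is repeated `w` times, which is one way to read the weights: each sampled unit stands in for `w` units of the population.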
Similar resources
Are Survey Weights Necessary? The Maximum Likelihood Approach to Sample Survey Inference
In the present work we explicate the application of maximum likelihood inference to the analysis of surveys that result from (possibly informative) stratified sampling. In Section 1 we review basic ideas, including two general results useful for applying maximum likelihood to sample data. Ideas are illustrated by a simple regression-through-the-origin model. In Section 2, we discuss the ...
Stratified Sampling Design Based on Data Mining
OBJECTIVES: To explore classification rules, based on data mining methodologies, for defining strata in stratified sampling of healthcare providers with improved sampling efficiency. METHODS: We performed k-means clustering to group providers with similar characteristics, then constructed decision trees on cluster labels to generate stratification rules. We assessed the varia...
Comparison of Clustering Methods over a Hidden Web Data using Stratification
This paper focuses on data mining in general, and clustering in particular, on hidden web data. Data mining is a process that analyzes and extracts knowledge from large amounts of data, providing useful information to users. Hidden or deep web data is a database located on a remote system. So, to access such data, we need a query interface or HTM...
An Evaluation of Stratified Sampling of Microarchitecture Simulations
Recent research advocates applying sampling to accelerate microarchitecture simulation. Simple random sampling offers accurate performance estimates (with a high quantifiable confidence) by taking a large number (e.g., 10,000) of short performance measurements over the full length of a benchmark. Simple random sampling does not exploit the often repetitive behaviors of benchmarks, collecting ma...